Spanning Tree Based Attribute Clustering

Authors

  • Yifeng Zeng
  • Jorge Cordero Hernandez
  • Shuyuan Lin
Abstract

Attribute clustering has previously been employed to detect statistical dependence between subsets of variables. We propose a novel attribute clustering algorithm, called the Star Discovery algorithm, motivated by research on complex networks. The algorithm partitions a maximum spanning tree and indirectly discards its inconsistent edges by starting from appropriate initial modes, thereby generating stable clusters. It discovers sound clusters through simple graph operations and achieves significant computational savings. We compare the Star Discovery algorithm against earlier attribute clustering algorithms and evaluate its performance in several domains.
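The abstract gives no procedural detail, so the following is only a minimal sketch of the general idea, not the authors' Star Discovery procedure. It assumes pairwise dependence scores between attributes are already available (the values below are made up), builds a maximum spanning tree with Kruskal's algorithm, and then forms star-shaped clusters with a simplified greedy rule. The function names `maximum_spanning_tree` and `star_clusters` and the centre-selection rule are hypothetical choices made for illustration.

```python
def maximum_spanning_tree(n, weights):
    """Kruskal's algorithm with edges taken in descending weight order.

    n: number of attributes, indexed 0..n-1.
    weights: dict mapping (i, j) to a dependence score (assumed given).
    Returns a list of (i, j, weight) tree edges.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for (i, j), w in sorted(weights.items(), key=lambda kv: -kv[1]):
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so it creates no cycle
            parent[ri] = rj
            tree.append((i, j, w))
    return tree


def star_clusters(n, tree):
    """Simplified star partition (an assumption, not the published rule).

    Repeatedly pick the unassigned node whose tree edges to other unassigned
    nodes have the largest total weight, and group it with those neighbours.
    Tree edges between different clusters are dropped implicitly.
    """
    adj = {v: {} for v in range(n)}
    for i, j, w in tree:
        adj[i][j] = w
        adj[j][i] = w

    unassigned = set(range(n))
    clusters = []
    while unassigned:
        def score(v):
            return sum(w for u, w in adj[v].items() if u in unassigned)

        centre = max(sorted(unassigned), key=score)  # ties go to the lowest index
        members = {centre} | {u for u in adj[centre] if u in unassigned}
        clusters.append(sorted(members))
        unassigned -= members
    return clusters


# Hypothetical dependence scores for six attributes.
scores = {
    (0, 1): 0.9, (0, 2): 0.8, (1, 2): 0.3,
    (3, 4): 0.7, (3, 5): 0.6, (4, 5): 0.4,
    (2, 3): 0.2,
}
tree = maximum_spanning_tree(6, scores)
print(star_clusters(6, tree))
```

In this toy input the weak tree edge between attributes 2 and 3 is never cut explicitly; it simply ends up between two stars. The greedy rule depends on the order in which centres are picked, so it should be read as an illustration of partitioning a spanning tree rather than as a stable clustering method.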

Similar resources

Attribute Clustering Based on Heuristic Tree Partition

Attribute clustering has been previously employed to detect statistical dependence between subsets of variables. Clusters of variables can be appropriately used for detecting highly dependent domain variables and then reducing the complexity of learning Bayesian networks. We propose a novel attribute clustering algorithm motivated by research of complex networks, called the Star Discovery algor...

A Stock Market Filtering Model Based on Minimum Spanning Tree in Financial Networks

There have been several efforts in the literature to extract as much information as possible from the financial networks. Most of the research has been concerned about the hierarchical structures, clustering, topology and also the behavior of the market network; but not a notable work on the network filtration exists. This paper proposes a stock market filtering model using the correlation - ba...

Fast and Improved Feature subset selection Algorithm Based Clustering for High Dimensional Data

The Clustering is a method of grouping the information into modules or clusters. Their dimensionality increases usually with a tiny number of dimensions that are significant to definite clusters, but data in the unrelated dimensions may produce much noise and wrap the actual clusters to be exposed. Attribute subset selection method is frequently used for data reduction through removing unrelate...

Constructing Minimal Spanning Tree Based on Rough Set Theory for Gene Selection

Microarray gene dataset often contains high dimensionalities which cause difficulty in clustering and classification. Datasets containing huge number of genes lead to increased complexity and therefore, degradation of dataset handling performance. Often, all the measured features of these high-dimensional datasets are not relevant for understanding the underlying phenomena of interest. Dimensio...

SOLVING A STEP FIXED CHARGE TRANSPORTATION PROBLEM BY A SPANNING TREE-BASED MEMETIC ALGORITHM

In this paper, we consider the step fixed-charge transportation problem (FCTP) in which a step fixed cost, sometimes called a setup cost, is incurred if another related variable assumes a nonzero value. In order to solve the problem, two metaheuristic, a spanning tree-based genetic algorithm (GA) and a spanning tree-based memetic algorithm (MA), are developed for this NP-hard problem. For compa...

Publication date: 2009